British Journal of Anaesthesia 113 (3): 424-32 (2014) 
Advance Access publication 11 April 2014 • doi:10.1093/bja/aeu100 

Reliability of the American Society of Anesthesiologists 
physical status scale in clinical practice 

A. Sankar, S. R. Johnson, W. S. Beattie, G. Tait and D. N. Wijeysundera* 

Faculty of Medicine and Institute of Health Policy Management and Evaluation, University of Toronto, Toronto, Ontario, Canada 

Division of Rheumatology, Department of Medicine, Toronto Western Hospital, Mount Sinai Hospital, and University of Toronto, Toronto, Ontario, Canada 

Department of Anesthesia, Toronto General Hospital and University of Toronto, Toronto, Ontario, Canada 
Li Ka Shing Knowledge Institute of St Michael's Hospital, Toronto, Ontario, Canada 

* Corresponding author: Department of Anesthesia and Pain Management, Toronto General Hospital, EN 3-450, 200 Elizabeth Street, 
Toronto, ON, Canada M5G 2C4. E-mail: d.wijeysundera@utoronto.ca 




Editor's key points 

• The ASA physical status 
classification was 
designed as a measure of 
preoperative health 
status, not operative risk. 

• This study found good 
agreement with how 
different anaesthetists 
rate a patient's ASA 
classification. 

• This study used 
psychometric methods to 
show that the ASA 
classification is an 
indicator of perioperative 
risk. 



Background. Previous studies, which relied on hypothetical cases and chart reviews, have 
questioned the inter-rater reliability of the ASA physical status (ASA-PS) scale. We therefore 
conducted a retrospective cohort study to evaluate its inter-rater reliability and validity in 
clinical practice. 

Methods. The cohort included all adult patients (> 18 yr) who underwent elective non-cardiac 
surgery at a quaternary-care teaching institution in Toronto, Ontario, Canada, from March 
2010 to December 2011. We assessed inter-rater reliability by comparing ASA-PS scores 
assigned at the preoperative assessment clinic vs the operating theatre. We also assessed 
the validity of the ASA-PS scale by measuring its association with patients' preoperative 
characteristics and postoperative outcomes. 

Results. The cohort included 10 864 patients, of whom 5.5% were classified as ASA I, 42.0% as 
ASA II, 46.7% as ASA III, and 5.8% as ASA IV. The ASA-PS score had moderate inter-rater 
reliability (κ 0.61), with 67.0% of patients (n=7279) being assigned to the same ASA-PS 
class in the clinic and operating theatre, and 98.6% (n=10 712) of paired assessments 
being within one class of each other. The ASA-PS scale was correlated with patients' age 
(Spearman's ρ, 0.23), Charlson comorbidity index (ρ=0.24), revised cardiac risk index 
(ρ=0.40), and hospital length of stay (ρ=0.16). It had moderate ability to predict in-hospital 
mortality (receiver-operating characteristic curve area 0.69) and cardiac complications 
(receiver-operating characteristic curve area 0.70). 

Conclusions. Consistent with its inherent subjectivity, the ASA-PS scale has moderate inter- 
rater reliability in clinical practice. It also demonstrates validity as a marker of patients' 
preoperative health status. 

The ASA physical status (ASA-PS) scale is commonly used to subjectively 
estimate preoperative health status. While originally created for 
statistical data collection and reporting in anaesthesia, it is now used 
for allocating resources, reimbursing anaesthesia services, and 
predicting perioperative risk. 

Inter-rater reliability is important when assessing the ASA-PS. Most 
reliability studies of the ASA-PS involved different anaesthesiologists 
rating hypothetical case scenarios. These studies found only fair 
inter-rater agreement (κ 0.21-0.4), thus raising concerns about the 
scale's reliability. There has been little evaluation of its reliability 
in clinical practice. In a multicentre study involving 1357 anaesthesia 
records, the ASA-PS score assigned by the responsible anaesthesiologist 
had moderate agreement (κ 0.53) with the score assigned by another 
blinded anaesthesiologist who had reviewed a duplicate version of the 
same medical record. A similar single-centre study of 430 paediatric 
anaesthesia records found low-to-moderate reliability (κ 0.43). 

Given the paucity of relevant data, we undertook a cohort study to 
characterize the reliability and validity of the ASA-PS scale in clinical 
practice. The primary objective was to evaluate the inter-rater agreement 
of ASA-PS scores assigned at outpatient preoperative assessment clinics 
vs operating theatres. The secondary objectives were to assess the 
scale's validity as a measure of health status by measuring its 
association with patient characteristics, validated predictive indices 
[Charlson comorbidity index and revised cardiac risk index (RCRI)], 
hospital stay, complications, and mortality. 

Methods 

After research ethics board approval, we conducted a retro- 
spective cohort study of consecutive adults aged >18 yr who 
underwent elective non-cardiac surgery from March 2010 to 
December 2011 at the University Health Network (Toronto, 
Ontario, Canada), a quaternary care medical centre offering 
all adult surgical services except trauma and obstetrics. The 
cohort included all individuals who underwent elective non- 
cardiac surgery within 30 days after outpatient assessment 
at the institutional preoperative assessment clinics. The re- 
search ethics board waived the requirement for written 
informed consent for this study. 

Data sources 

At the preoperative clinics, nurses document histories using a 
structured electronic questionnaire (Clinical Anesthesia Information 
System PreOp Clinic, Adjuvant Informatics, Flamborough, Ontario, Canada) 
that captures age, sex, comorbidities, and medications in a linkable 
data set. Each record includes an ASA-PS score assigned by the 
anaesthesiologist in the clinic (Table 1). Case records from the clinic 
database were linked to the Enterprise Electronic Data Warehouse (EDW), 
which captures all information recorded by the hospital electronic 
charting system (MISYS ERR; Quadramed Corporation, Reston, VA, USA). 
The EDW includes information on surgeries, laboratory tests, in-hospital 
medications, hospital length-of-stay, in-hospital mortality, and 
International Classification of Diseases 10th Revision (ICD-10) 
diagnostic codes. Documented surgical information includes an ASA-PS 
score assigned by the anaesthesiologist in the operating theatre. 

The primary variables of interest were ASA-PS scores assigned in the 
preoperative clinics and operating theatres. Patients' age, sex, surgery, 
preoperative creatinine concentration, hospital length of stay, 
in-hospital 30 day mortality, and postoperative myocardial injury 
(troponin I concentration exceeding 0.30 µg litre⁻¹) were captured from 
the EDW. We ascertained specific comorbidities using the clinic data set 
(hypertension, coronary artery disease, heart failure, diabetes, 
cerebrovascular disease, asthma, chronic obstructive pulmonary disease) 
and EDW (Charlson comorbidity index). We calculated the RCRI score using 
information from the EDW (surgical procedure and preoperative creatinine 
concentration) and clinic data set (other comorbidities). 

Table 1 Description of ASA-PS classes 

Contextual factors 

Several factors should be considered when comparing ASA-PS ratings in 
operating theatres vs preoperative clinics at the University Health 
Network. First, it is possible that an individual patient received care 
from the same anaesthesiologist in the clinic and operating theatre. 
Such scenarios were uncommon given the ~65 consultant anaesthesiologists 
at the institution. Secondly, anaesthesiologists in operating theatres 
were not blinded to ASA-PS assessments performed in the clinics. Blinding 
was not feasible since clinic assessments are part of routine clinical 
care. Nonetheless, anaesthesiologists in operating theatres typically 
pay little attention to the clinic rating, which is reported as a single 
non-highlighted line in an extensive computer-generated report. Thirdly, 
anaesthesiologists in operating theatres (but not clinics) receive 
financial premiums from the government health insurance plan to provide 
anaesthetic care to ASA-PS class III or class IV patients. These premiums 
were paid to the anaesthesiologists' group practice plan, and hence 
shared among all its members. 

Analysis 

Analyses were performed using STATA version 13 (StataCorp 
Inc., Lakeway, TX, USA) and the R statistical language. A two-tailed 
P-value of <0.05 was used to define statistical significance. 

ASA-PS class   Description
Class I        A normal healthy patient
Class II       A patient with mild systemic disease
Class III      A patient with severe systemic disease
Class IV       A patient with severe systemic disease that is a constant threat to life
Class V        A moribund patient who is not expected to survive without operation
Class VI       A declared brain-dead patient whose organs are being removed for donation

Reliability refers to the reproducibility of an instrument, with 
inter-rater reliability referring to the application of the ASA-PS scale 
to the same group of patients by different raters. We measured agreement 
of ASA-PS ratings assigned in the clinic vs operating theatre using the 
intra-class correlation coefficient (ICC) and Cohen's weighted κ. Landis 
and Koch characterize reliability statistic values of 0-0.20 as 'slight', 
0.21-0.4 as 'fair', 0.41-0.60 as 'moderate', 0.61-0.80 as 'substantial', 
and values exceeding 0.80 as 'almost perfect'. McHorney and Tarlov have 
also suggested that the ICC for measures applied to individual patients 
should exceed 0.90. We conducted two sensitivity analyses. First, we 
re-calculated the ICC after excluding successive randomly selected 
patients whose raters agreed on ASA-PS classification. This analysis 
assessed how the lack of blinding impacted our results. This process was 
repeated until the ICC approached values reported by previous blinded 
studies. Secondly, we modified our study cohort such that the number of 
patients who were 'up-coded' in a financially advantageous manner (i.e. 
from class II to III, or from class III to IV) was equal to the number 
'down-coded' in a financially disadvantageous manner (i.e. from class III 
to II, or from class IV to III). Any excess 'up-coded' patients were 
classified instead as having identical ratings in the clinic and 
operating theatre. The ICC was re-calculated in this hypothetical cohort 
to assess the influence of financial incentives for anaesthesiologists in 
operating theatres to assign patients to ASA-PS class III or IV. 
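
The second sensitivity analysis can be illustrated with a short Python sketch. It is not the authors' code: the function name `balance_upcoding` is hypothetical, the counts are taken from the cross-tabulation in Table 3, and the choice to give each excess 'up-coded' patient the clinic class in both settings is an assumption, since the text only says that such patients were treated as having identical ratings.

```python
import copy

# Table 3 counts: rows = operating-theatre rating, columns = clinic rating
# (index 0..3 corresponds to ASA I..IV).
TABLE3 = [
    [285, 201, 28, 1],
    [264, 2814, 807, 20],
    [52, 1497, 3857, 283],
    [1, 50, 381, 323],
]

# (clinic class index, theatre class index) pairs that attract a premium
# when the theatre rating is higher: II -> III and III -> IV.
UPCODE_PAIRS = [(1, 2), (2, 3)]


def balance_upcoding(table, pairs=UPCODE_PAIRS):
    """Return a copy of the table in which excess up-coded patients
    are re-classified as agreement (assumption: clinic class is kept)."""
    adjusted = copy.deepcopy(table)
    for clinic, theatre in pairs:
        up = adjusted[theatre][clinic]    # rated one class higher in theatre
        down = adjusted[clinic][theatre]  # mirror cell: rated one class lower
        excess = max(up - down, 0)
        adjusted[theatre][clinic] -= excess
        adjusted[clinic][clinic] += excess
    return adjusted


adjusted = balance_upcoding(TABLE3)
total = sum(sum(row) for row in adjusted)
same_class = sum(adjusted[i][i] for i in range(4))
print(f"exact agreement in modified cohort: {same_class}/{total}")
```

The sketch only rebuilds the modified cross-tabulation and its exact agreement; re-estimating the ICC on that cohort would require a separate routine.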

We categorized each patient as being assigned (i) the same ASA-PS score 
in the clinic and operating theatre, (ii) a lower score in the operating 
theatre, and (iii) a higher score in the operating theatre. We compared 
the characteristics of the categories using the χ² test for categorical 
variables, and analysis of variance or the Kruskal-Wallis test for 
continuous variables. We also used multivariable logistic regression to 
determine the adjusted association of patient and surgery 
characteristics with inter-rater disagreement. The dependent variable 
was any disagreement in ASA-PS scores, while the predictor variables 
included age, surgery, and comorbidities. In the primary analysis, 
individual comorbidities were considered as separate predictor 
variables, while a sensitivity analysis instead considered the total 
number of concurrent systemic diseases as a predictor variable. 

In the primary analysis, we assessed the validity of ASA-PS ratings in 
the clinic, while ratings in operating theatres were assessed in a 
secondary analysis. Both construct and criterion validity were 
evaluated. Construct validity refers to whether the ASA-PS scale behaves 
like a measure of preoperative physical status. For example, individuals 
with poorer physical status are likely to be older and have more 
comorbidity. We used descriptive statistics to characterize strata 
defined by ASA-PS rating. Categorical variables were described using 
counts and proportions, while continuous variables were described using 
means, standard deviations, medians, and inter-quartile ranges. We 
compared characteristics of these strata using the χ² test for 
categorical variables, and analysis of variance or the Kruskal-Wallis 
test for continuous variables. The correlation of ASA-PS rating with age 
was further assessed using Spearman's ρ. 

We also evaluated the criterion validity of the ASA-PS scale, which 
consists of concurrent and predictive validity. Concurrent validity 
refers to whether the scale correlates with other indices of health 
status measured at approximately the same time. Spearman's ρ was used to 
assess the correlation of ASA-PS ratings with the Charlson comorbidity 
index and RCRI scores. Predictive validity describes whether the ASA-PS 
scale predicts future related events. For example, patients with poor 
health status are more likely to suffer postoperative morbidity and 
mortality. We used the area-under-the-curve (AUC) of the 
receiver-operating characteristic curve to separately measure the 
discrimination of ASA-PS ratings for the outcomes of in-hospital 30 day 
mortality and myocardial injury. Additionally, the correlation of ASA-PS 
ratings with hospital length-of-stay was measured using Spearman's ρ. 

We used all available data from our databases within the study time 
frame (March 2010-December 2011). To place the available sample size in 
context, we estimated the sample size required to measure a plausible 
degree of inter-rater reliability with acceptable precision. The sample 
size required to measure an ICC of 0.41 (moderate agreement) with a 
lower two-sided 95% confidence interval (CI) excluding an ICC of 0.21 
(fair agreement) with 90% power was 175. 

Results 

The cohort consisted of 10 864 patients (Table 2), of whom 5.5% 
(n=602) were assigned to ASA-PS class I, 42.0% (n=4562) 
assigned to class II, 46.7% (n=5073) assigned to class III, and 
5.8% (n=627) assigned to class IV in the preoperative clinic. 

Inter-rater reliability 

The agreement between ASA-PS scores assigned in the preoperative clinic 
vs operating theatre is presented in Table 3 and Figure 1. Approximately 
67% of individuals (n=7279) were assigned to the same ASA-PS class in the 
clinic and operating theatre, while 98.6% (n=10 712) of paired 
assessments were within one ASA-PS class of each other. Approximately 
21% (n=2245) were assigned to a higher ASA-PS class in the operating 
theatre, while 12% (n=1340) were assigned to a lower class. Inter-rater 
reliability measured by the one-way ICC was 0.61 (95% CI, 0.60-0.62), 
while the weighted κ statistic was 0.61 (95% CI, 0.60-0.62). The 
calculated ICC approached values seen in prior unblinded studies if 
one-third to one-half of all cases of inter-rater agreement were 
excluded (Supplementary Fig. S1). When the study cohort was modified to 
remove the effects of any financial incentives for ASA-PS classification 
(Supplementary Table S1), the re-calculated ICC increased to 0.68 (CI, 
0.67-0.69). 
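
The agreement counts quoted above can be checked against the cross-tabulation in Table 3. The following Python sketch is illustrative only: the function names are hypothetical, and the quadratic weighting of disagreements is an assumption, because the text does not state which weighting scheme was used for Cohen's weighted κ. Under that assumption the sketch gives a weighted κ of roughly 0.61; linear weights give a lower value.

```python
# Table 3 counts: rows = operating-theatre rating, columns = clinic rating
# (index 0..3 corresponds to ASA I..IV).
TABLE3 = [
    [285, 201, 28, 1],
    [264, 2814, 807, 20],
    [52, 1497, 3857, 283],
    [1, 50, 381, 323],
]


def agreement_summary(table):
    """Count exact, within-one-class, higher- and lower-in-theatre pairs."""
    k = len(table)
    cells = [(i, j, table[i][j]) for i in range(k) for j in range(k)]
    return {
        "total": sum(n for _, _, n in cells),
        "same": sum(n for i, j, n in cells if i == j),
        "within_one": sum(n for i, j, n in cells if abs(i - j) <= 1),
        "higher_in_theatre": sum(n for i, j, n in cells if i > j),
        "lower_in_theatre": sum(n for i, j, n in cells if i < j),
    }


def weighted_kappa(table, power=2):
    """Cohen's weighted kappa with disagreement weights |i - j| ** power.

    power=2 (quadratic weights) is an assumption; power=1 gives linear weights.
    """
    k = len(table)
    total = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    observed = 0.0
    expected = 0.0
    for i in range(k):
        for j in range(k):
            weight = abs(i - j) ** power
            observed += weight * table[i][j]
            expected += weight * row_totals[i] * col_totals[j] / total
    return 1.0 - observed / expected


summary = agreement_summary(TABLE3)
kappa_quadratic = weighted_kappa(TABLE3, power=2)
kappa_linear = weighted_kappa(TABLE3, power=1)
print(summary)
print(f"quadratic: {kappa_quadratic:.3f}, linear: {kappa_linear:.3f}")
```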

In unadjusted analyses, inter-rater disagreement was asso- 
ciated with patient age, surgery, specific comorbidities (i.e. cor- 
onary artery disease, peripheral vascular disease, hypertension, 
asthma, cancer), and Charlson comorbidity index scores 
(Table 4). After multivariable adjustment, factors significantly 
associated with inter-rater disagreement were age, surgical pro- 
cedure, hypertension, and malignancy (Table 5). Surgical proce- 
dures that were significantly less likely to be associated with 
inter-rater disagreement were general surgery, neurosurgery, 
orthopaedic, and urological procedures. In a sensitivity analysis, 
increased burden of comorbidity was associated with lower 
odds of inter-rater disagreement (Supplementary Table S2). 
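
As a reading aid for the adjusted odds ratios in Table 5, the Python sketch below recovers an approximate test statistic and P-value from a reported odds ratio and its 95% confidence interval. It assumes the interval is a symmetric Wald interval on the log-odds scale, which the text does not state; the function name is hypothetical. Because the published values are rounded to two decimals, the result only approximates the tabulated P-value (0.001 for hypertension).

```python
import math


def p_from_odds_ratio(odds_ratio, ci_lower, ci_upper, z_crit=1.96):
    """Approximate two-sided Wald P-value from an odds ratio and its 95% CI."""
    # Standard error of log(OR), recovered from the width of the interval.
    se = (math.log(ci_upper) - math.log(ci_lower)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided tail probability of the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Hypertension row of Table 5: odds ratio 0.86, 95% CI 0.78-0.94.
z_htn, p_htn = p_from_odds_ratio(0.86, 0.78, 0.94)
print(f"z = {z_htn:.2f}, P = {p_htn:.4f}")
```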

Validity 

The ASA-PS classes assigned in the clinic differed significantly 
with respect to age, sex, surgery, comorbidities, and composite 
comorbidity index scores (Table 2). In general, individuals in 
higher ASA-PS classes were likely to be older males with more 
comorbid disease and higher composite comorbidity index 
scores. These same individuals had longer stays in hospital, 
and also higher risks of postoperative mortality and myocardial 
injury (Table 2). The ASA-PS rating in the clinic was correlated 
with age (Spearman's ρ, 0.23; CI, 0.21-0.25), Charlson comorbidity 
index score (Spearman's ρ, 0.24; CI, 0.22-0.26), and 
RCRI score (Spearman's ρ, 0.40; CI, 0.38-0.42). The rating had 
moderate discrimination for predicting 30 day in-hospital mortality 
(AUC 0.69; CI, 0.62-0.76) and myocardial injury (AUC 
0.70; CI, 0.65-0.75). It was weakly correlated with hospital 
length of stay (Spearman's ρ, 0.16; CI, 0.15-0.18). 
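
For a four-level ordinal score such as the ASA-PS class, the AUC equals the probability that a randomly chosen patient with the event has a higher class than a randomly chosen patient without it, counting ties as one half. The Python sketch below applies that definition to the event counts per clinic ASA-PS class listed in Table 2. The function name is hypothetical and this is not the authors' code; it is a minimal check that the grouped counts are consistent with the reported AUC values of about 0.69 and 0.70.

```python
# Patients per clinic ASA-PS class (I..IV) and event counts from Table 2.
CLASS_SIZES = [602, 4562, 5073, 627]
DEATHS = [0, 11, 25, 15]
MYOCARDIAL_INJURY = [0, 19, 56, 28]


def auc_from_grouped_counts(class_sizes, events):
    """AUC of an ordinal score from per-class event counts (ties count 0.5)."""
    non_events = [n - e for n, e in zip(class_sizes, events)]
    concordant = 0.0
    lower_non_events = 0  # non-events in strictly lower classes
    for e, ne in zip(events, non_events):
        concordant += e * (lower_non_events + 0.5 * ne)
        lower_non_events += ne
    return concordant / (sum(events) * sum(non_events))


auc_mortality = auc_from_grouped_counts(CLASS_SIZES, DEATHS)
auc_injury = auc_from_grouped_counts(CLASS_SIZES, MYOCARDIAL_INJURY)
print(f"mortality: {auc_mortality:.3f}, myocardial injury: {auc_injury:.3f}")
```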

Table 2 Characteristics of study cohort, stratified by ASA-PS rating. ENT, ear-nose-throat; SD, standard deviation. *Defined as preoperative dialysis 
requirement or preoperative creatinine concentration exceeding 176 µmol litre⁻¹ (2.0 mg dl⁻¹) 

                                          ASA-PS rating assigned in the preoperative assessment clinic
                                          I (n=602)       II (n=4562)     III (n=5073)    IV (n=627)      P-value
Patient characteristics
  Age (yr), mean (range)                  44.0 (18-99)    58.0 (18-95)    61.0 (18-95)    [illegible]     [illegible]
  Male sex                                290 (48.2%)     2107 (46.2%)    2591 (51.1%)    [illegible]     [illegible]
Surgical service                                                                                          [illegible]
  ENT surgery                             95 (15.8%)      656 (14.4%)     514 (10.1%)     [illegible]
  General surgery                         105 (17.4%)     590 (12.9%)     998 (19.7%)     [illegible]
  Gynaecology                             49 (8.1%)       416 (9.1%)      367 (7.2%)      46 (7.3%)
  Neurosurgery                            11 (1.8%)       271 (5.9%)      592 (11.7%)     [illegible]
  Ophthalmology                           0 (0.0%)        2 (0.0%)        2 (0.0%)        [illegible]
  Orthopaedic surgery                     126 (20.9%)     1330 (29.2%)    1179 (23.2%)    27 (4.3%)
  Plastic surgery                         35 (5.8%)       169 (3.7%)      80 (1.6%)       [illegible]
  Thoracic surgery                        13 (2.2%)       257 (5.6%)      585 (11.5%)     [illegible]
  Urology                                 168 (27.9%)     840 (18.4%)     565 (11.1%)     [illegible]
  Vascular surgery                        0 (0.0%)        31 (0.7%)       191 (3.8%)      [illegible]
Comorbid disease
  Coronary artery disease                 5 (0.8%)        316 (6.9%)      992 (19.6%)     223 (35.5%)     <0.001
  Congestive heart failure                1 (0.2%)        9 (0.2%)        121 (2.4%)      [illegible]     <0.001
  Peripheral vascular disease             0 (0.0%)        32 (0.7%)       163 (3.2%)      [illegible]     <0.001
  Cerebrovascular disease                 2 (0.3%)        86 (1.9%)       376 (7.4%)      [illegible]     <0.001
  Hypertension                            16 (2.7%)       1574 (34.5%)    2736 (53.9%)    [illegible]     <0.001
  Diabetes                                2 (0.3%)        386 (8.5%)      1143 (22.5%)    [illegible]     <0.001
  Renal insufficiency*                    0 (0%)          5 (0.1%)        118 (2.3%)      47 (7.5%)       <0.001
  Chronic obstructive pulmonary disease   0 (0%)          132 (2.9%)      430 (8.5%)      [illegible]     <0.001
  Asthma                                  25 (4.2%)       389 (8.5%)      629 (12.4%)     [illegible]     <0.001
  Rheumatic disease                       0 (0.0%)        27 (0.6%)       87 (1.7%)       14 (2.3%)       <0.001
  Peptic ulcer disease                    0 (0.0%)        17 (0.4%)       36 (0.7%)       8 (1.3%)        0.003
  Liver disease                           4 (0.7%)        70 (1.5%)       168 (3.3%)      35 (5.6%)       <0.001
  Cancer                                                                                                  [illegible]
    Primary disease                       118 (19.6%)     1364 (29.9%)    1331 (26.2%)    [illegible]
    Metastatic disease                    33 (5.5%)       507 (11.1%)     790 (15.6%)     [illegible]
Comorbidity indices
  Charlson comorbidity index, mean (SD)   0.83 (1.79)     1.59 (2.37)     2.28 (2.70)     3.25 (2.78)     <0.001
  Revised cardiac risk index, mean (SD)   0.21 (0.42)     0.36 (0.57)     0.88 (0.88)     1.61 (1.13)     <0.001
Outcomes
  Postoperative myocardial injury         0 (0.0%)        19 (0.4%)       56 (1.1%)       28 (4.5%)       <0.001
  30 day in-hospital mortality            0 (0.0%)        11 (0.2%)       25 (0.5%)       15 (2.4%)       <0.001
  Hospital length of stay, mean (SD)      3.0 (2.7)       4.0 (5.3)       6.0 (8.9)       8.0 (11.2)      <0.001


In a secondary analysis, ASA-PS ratings in the operating 
theatre had higher correlations with age (Spearman's ρ, 0.28; 
CI, 0.26-0.29), Charlson comorbidity index (Spearman's ρ, 
0.28; CI, 0.27-0.30), RCRI (Spearman's ρ, 0.42; CI, 0.41-0.44), 
and hospital length of stay (Spearman's ρ, 0.20; CI, 0.19-0.22). 
Ratings in the operating theatre had moderate ability 
to predict mortality (AUC 0.74; CI, 0.68-0.80) and myocardial 
injury (AUC 0.75; CI, 0.71-0.79). Compared with clinic ratings, 
ratings in the operating theatre differed significantly with 
respect to predicting myocardial injury (P=0.01) but not 
mortality (P=0.17). 



Discussion 

Given the ubiquity of the ASA-PS scale in clinical practice, it is 
important to define its reliability and validity. In this large 
single-institution study, the ASA-PS scale had moderate inter- 
rater reliability, despite its inherent subjectivity. Furthermore, it 
demonstrated validity as a measure of preoperative health 
status, showing expected patterns of association with 
patient characteristics and postoperative outcomes. 

Poor reliability has been among the largest criticisms of the ASA-PS 
scale. For example, a previous study relying on hypothetical case 
scenarios found only fair inter-rater agreement (κ 0.21-0.4). Another 
study found moderate inter-rater agreement (κ 0.53) when instead 
comparing ratings by the responsible anaesthesiologist vs a different 
blinded anaesthesiologist reviewing the same medical record. In contrast 
to previous work, our study evaluated the inter-rater reliability of the 
ASA-PS scale in 'real-world' clinical practice. Since we compared ASA-PS 
ratings performed by two anaesthesiologists involved in the clinical care 
of the same patient, both raters had the opportunity to interview, 
physically examine, and participate in clinical decision-making. This 
increased degree of clinical engagement may have explained, in part, the 
higher observed inter-rater reliability. This degree of inter-rater 
agreement is remarkable for a subjective rating scale, with 67% of 
patients being assigned the same ASA-PS score, and almost 99% being 
assigned scores within one ASA-PS class of each other. 

Table 3 Agreement between ASA-PS ratings in the preoperative assessment clinic vs operating theatre 

ASA-PS rating in the     ASA-PS rating assigned in the preoperative assessment clinic
operating theatre        ASA I (n=602)    ASA II (n=4562)   ASA III (n=5073)   ASA IV (n=627)
ASA I (n=515)            285 (47.3%)      201 (4.4%)        28 (0.6%)          1 (0.2%)
ASA II (n=3905)          264 (43.9%)      2814 (61.7%)      807 (15.9%)        20 (3.2%)
ASA III (n=5689)         52 (8.6%)        1497 (32.8%)      3857 (76.0%)       283 (45.1%)
ASA IV (n=755)           1 (0.2%)         50 (1.1%)         381 (7.5%)         323 (51.5%)

Fig 1 Distribution of ASA-PS ratings in the operating theatre, within strata defined by ASA-PS rating in the preoperative assessment clinic. 
[Bar chart. Horizontal axis: ASA-PS rating in the preoperative assessment clinic (ASA I to ASA IV). Vertical axis: percentage (10% to 80%). Series: ASA I, ASA II, ASA III, and ASA IV in the operating theatre.] 



Despite the increased degree of inter-rater reliability in our 
present study, the ICC (0.61) and weighted κ (0.61) still fell 
below the minimum of 0.90 recommended by McHorney 
and Tarlov. The absence of high inter-rater reliability is also 
not surprising. There is inherent subjectivity to differentiating 
between patients with 'mild systemic disease', 'severe systemic 
disease', and 'severe systemic disease that is a constant threat 
to life', especially in the absence of a 'moderate systemic 
disease' category or further standardized information to help 
define the existing categories. 

We identified several factors associated with inter-rater disagreement, 
namely age, surgery, hypertension, malignancy, and comorbidity burden. 
Age has been previously noted as a source of disagreement in ASA-PS 
ratings, especially since there are no guidelines on how patients' age 
should be considered when assigning ASA-PS scores. Nonetheless, the 
association between age and inter-rater disagreement in our study should 
be viewed cautiously since its statistical significance was not strong. 
Additionally, the association did not follow a logical pattern, such as 
increasing inter-rater disagreement at the extremes of age. Surgical 
procedure has also previously been identified as a source of inter-rater 
disagreement. For example, Haynes and Lawler found that 
anaesthesiologists assigned patients undergoing minor surgical 
procedures to lower ASA-PS classes than would be otherwise expected, 
even when the patients had serious medical disease. The influence of 
surgical procedure on inter-rater disagreement is likely driven by 
misunderstanding of the ASA-PS classification system, which was 
developed to measure preoperative health status, not operative risk. 
Indeed, in his original paper, Saklad stated that the ASA-PS grade had 
'no relation to the operative procedure, the ability of the surgeon or 
anesthetist, nor the type of anesthesia the patient will receive'. 
Nonetheless, many anaesthesia providers still consider the ASA-PS scale 
an anaesthetic risk predictor. The association of specific comorbidities 
with inter-rater disagreement in our study has some consistency with 
previous research. In addition, our results suggest that clinicians are 
less likely to agree on how some medical conditions (e.g. cancer) impact 
on preoperative physical status, but more likely to agree on the impact 
of the total burden of comorbidity. 

Table 4 Characteristics of categories defined by level of inter-rater agreement for ASA-PS rating. ENT, ear-nose-throat; SD, standard deviation. 
*Defined as preoperative dialysis requirement or preoperative creatinine concentration exceeding 176 µmol litre⁻¹ (2.0 mg dl⁻¹) 

                                          Lower ASA-PS rating    No change in ASA-PS    Higher ASA-PS rating   P-value
                                          in operating theatre   rating                 in operating theatre
                                          [n=1340 (12.3%)]       [n=7279 (67.0%)]       [n=2245 (20.7%)]
Patient characteristics
  Age (yr), mean (range)                  [illegible]            [illegible]            [illegible]            <0.001
  Male sex                                [illegible]            [illegible]            [illegible]            [illegible]
Surgical service                                                                                               <0.001
  ENT surgery                             [illegible]            [illegible]            [illegible]
  General surgery                         [illegible]            [illegible]            [illegible]
  Gynaecology                             [illegible]            [illegible]            [illegible]
  Neurosurgery                            74 (8.3%)              676 (75.6%)            144 (16.1%)
  Ophthalmology                           [illegible]            [illegible]            1 (25.0%)
  Orthopaedic surgery                     [illegible]            [illegible]            355 (13.3%)
  Plastic surgery                         43 (14.7%)             172 (59.5%)            74 (25.6%)
  Thoracic surgery                        [illegible]            [illegible]            [illegible]
  Urology                                 [illegible]            [illegible]            [illegible]
  Vascular surgery                        36 (10.1%)             208 (58.3%)            113 (31.7%)
Comorbid disease
  Coronary artery disease                 [illegible]            [illegible]            [illegible]            0.04
  Congestive heart failure                [illegible]            [illegible]            [illegible]            0.82
  Peripheral vascular disease             [illegible]            [illegible]            [illegible]            <0.001
  Cerebrovascular disease                 [illegible]            393 (69.6%)            [illegible]            0.25
  Hypertension                            537 (11.3%)            3277 (69.2%)           921 (19.5%)            <0.001
  Diabetes                                [illegible]            [illegible]            [illegible]            [illegible]
  Renal insufficiency*                    [illegible]            [illegible]            [illegible]            [illegible]
  Chronic obstructive pulmonary disease   [illegible]            461 (68.6%)            143 (21.3%)            0.20
  Asthma                                  [illegible]            [illegible]            [illegible]            [illegible]
  Rheumatic disease                       [illegible]            [illegible]            [illegible]            [illegible]
  Peptic ulcer disease                    [illegible]            [illegible]            [illegible]            [illegible]
  Liver disease                           32 (11.6%)             178 (64.3%)            67 (24.2%)             0.82
  Malignancy                                                                                                   <0.001
    Primary disease                       398 (13.3%)            1821 (61.0%)           766 (25.7%)
    Metastatic disease                    207 (14.2%)            881 (60.5%)            368 (25.3%)
Comorbidity indices
  Charlson comorbidity index, mean (SD)   2.14 (2.68)            1.82 (2.50)            2.31 (2.72)            <0.001
  Revised cardiac risk index, mean (SD)   0.64 (0.84)            0.67 (0.85)            0.68 (0.83)            0.30

Table 5 Adjusted association of patient characteristics with disagreement in ASA-PS class rating. ENT, ear-nose-throat. *Defined as preoperative 
dialysis requirement or preoperative creatinine concentration exceeding 176 µmol litre⁻¹ (2.0 mg dl⁻¹) 

Patient characteristics                    Adjusted odds ratio for       95% confidence    P-value
                                           disagreement in ASA rating    interval

Male sex                                   1.04                          0.95-1.14         0.40
Age                                                                                        0.03
  40 yr or less                            Reference category
  41-60 yr                                 0.84                          0.74-0.96
  61-80 yr                                 0.94                          0.82-1.08
  81 yr or more                            0.89                          0.72-1.10
Surgical service                                                                           <0.001
  ENT surgery                              Reference category
  General surgery                          0.66                          0.57-0.77
  Gynaecology                              1.20                          1.00-1.43
  Neurosurgery                             0.46                          0.38-0.55
  Ophthalmology                            1.49                          0.21-10.67
  Orthopaedic surgery                      0.42                          0.36-0.48
  Plastic surgery                          0.99                          0.76-1.28
  Thoracic surgery                         0.97                          0.82-1.15
  Urology                                  0.88                          0.75-0.92
  Vascular surgery                         1.13                          0.82-1.58
Comorbid disease
  Coronary artery disease                  0.89                          0.78-1.00         0.06
  Congestive heart failure                 1.04                          0.76-1.41         0.83
  Peripheral vascular disease              1.02                          0.72-1.43         0.92
  Cerebrovascular disease                  0.91                          0.75-1.10         0.33
  Hypertension                             0.86                          0.78-0.94         0.001
  Diabetes                                 0.95                          0.84-1.07         0.42
  Renal insufficiency*                     0.91                          0.65-1.27         0.58
  Chronic obstructive pulmonary disease    0.88                          0.74-1.05         0.15
  Asthma                                   1.01                          0.88-1.15         0.93
  Rheumatic disease                        1.25                          0.86-1.84         0.25
  Peptic ulcer disease                     0.92                          0.52-1.62         0.77
  Liver disease                            1.16                          0.89-1.50         0.27
  Malignancy                                                                               0.002
    Primary disease                        1.20                          1.07-1.34
    Metastatic disease                     1.19                          1.04-1.37

In addition to evaluating the reliability of the ASA-PS scale, 
we assessed its construct, concurrent, and predictive validity.^^ 
The scale showed construct validity, based on fair correlation 
with patient age, and an increased burden of comorbidities in 
patients with higher ASA-PS scores (Table 2). Our findings 
confirm previous work, such as a single-centre study showing 
strong interdependence between ASA-PS ratings and National 
Surgical Quality Improvement Program clinical risk factors.^ 
The ASA-PS scale also exhibited concurrent validity. It was cor- 
related with more 'objective' comorbidity indices such as the 
Charlson comorbidity index, and RCRI. Notably, the correlation 
of ASA-PS scores with the Charlson comorbidity index was only 
slight-to-fair in magnitude. The relatively poor correlation may 
be explained by the subjectivity of the ASA-PS scale, and differ- 
ences in how the two scales were developed. The ASA-PS scale 
was intended to measure preoperative physical status, while 
the Charlson comorbidity index was developed to measure 
risks of 1 yr mortality in medical inpatients.^ These findings 
with respect to correlation with other comorbidity measures 
are consistent with previous research, such as a single-centre 
study showing correlation of the ASA-PS scale with the 
Neurological, Airway, Respiratory, Cardiovascular, and Other 
model of risk assessment in children. 

Our study confirmed the predictive validity of the ASA-PS 
scale. Even when assessed well before surgery in an outpatient 
clinic, the scale had moderate ability to predict postoperative 
mortality and cardiac complications. By comparison, its correl- 
ation with hospital length of stay was relatively weak, likely 
because hospital length of stay is influenced by many distinct 
clinical factors, such as surgery type. The ability of the ASA-PS 
scale to predict adverse outcomes has previously been observed 
for specific surgeries,''^^^^°^^ where higher ASA-PS scores were 
associated with higher mortality rates.^'' It has also shown 
modest ability to predict postoperative cardiac 
complications,^^ ^° and been an important component of 
models designed to predict postoperative mortality and 
morbidity.^ ® This consistent demonstration of moderate 
predictive validity by the ASA-PS scale, both in our present study and 
previous research, supports its use as a component of 
risk-adjustment models for comparing surgical outcomes 
across hospitals. The ASA-PS score is incorporated into the 
risk-adjustment model used by the National Surgical Quality 
Improvement Program to measure the quality of surgical care 
across US hospitals.^^ 
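
As an illustration of what 'moderate ability to predict' can mean for an ordinal score, discrimination is commonly summarized by the area under the receiver-operating characteristic curve (c-statistic): the probability that a randomly chosen patient with the outcome has a higher score than a randomly chosen patient without it, with ties counted as one half. The sketch below is a minimal example under that assumption. The function name `c_statistic` and the toy data are hypothetical, are not drawn from the study cohort, and are not presented as the computation used in this study.

```python
from itertools import product


def c_statistic(scores, outcomes):
    """Area under the ROC curve for an ordinal score, by pairwise comparison.

    scores   -- ordinal predictor values (e.g. ASA-PS class as an integer)
    outcomes -- 1 if the adverse outcome occurred, 0 otherwise
    """
    events = [s for s, y in zip(scores, outcomes) if y == 1]
    non_events = [s for s, y in zip(scores, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")

    concordant = 0.0
    for e, n in product(events, non_events):
        if e > n:
            concordant += 1.0   # event patient had the higher score
        elif e == n:
            concordant += 0.5   # tied scores count as half
    return concordant / (len(events) * len(non_events))


# Hypothetical data (illustration only): ASA-PS class and a binary outcome.
asa_class = [1, 2, 2, 3, 3, 3, 4, 4]
died = [0, 0, 0, 0, 1, 0, 1, 0]
print(round(c_statistic(asa_class, died), 3))
```

A value of 0.5 corresponds to no discrimination and 1.0 to perfect discrimination. Because the ASA-PS scale has few categories, ties are frequent, which is why they are handled explicitly here.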

Notably, we found that the ASA-PS rating in operating thea- 
tres exhibited better validity, based on higher correlations with 
age, comorbidity scores, and hospital length of stay, and also 
improved prediction of myocardial injury. In some cases, its 
superior predictive validity could be explained by changes in a 
patient's medical status between the clinic visit and subsequent 
surgery. Nonetheless, such cases should be rare since the cohort 
only included elective surgeries performed within 30 days of a 
preoperative clinic visit. The superior predictive validity may 
also have been due to anaesthesiologists in operating theatres 
being less 'blinded' to eventual outcomes than those in the 
clinic. For example, patients assigned to class II in the clinic 
may have been re-assigned to class III in the operating 
theatre if they developed severe intraoperative hypotension. 
Nonetheless, such differences in 'blinding' cannot explain the 
higher correlation of ASA-PS ratings in the operating theatre 
with age and comorbidity indices. Thus, our findings indicate 
that ASA-PS ratings have the greatest validity when assigned 
by the responsible anaesthesiologist in the operating theatre. 
The results also highlight potential limitations of using hypothet- 
ical case scenarios or reviews of medical records as models for 
evaluating the psychometric properties of the ASA-PS scale. 

Several study limitations need to be acknowledged. First, 
this was a retrospective cohort study from a single quaternary-
care teaching institution, as reflected by the high proportion of 
ASA III and ASA IV patients. Similar studies at other centres 
with differing case-mixes are necessary to better generalize 
our findings. Secondly, the cohort only included patients who 
underwent elective surgery after being assessed in an out- 
patient preoperative assessment clinic. Thus, the cohort 
excluded individuals who were assigned ASA-PS class V, class 
VI, or any emergency modifier ('E') code. Our findings therefore 
cannot be extrapolated to non-elective surgical procedures. 
Thirdly, anaesthesiologists in operating theatres were not 
blinded to ASA-PS scores assigned in the clinic, thereby poten- 
tially biasing patients' second ASA-PS rating and increasing 
inter-rater reliability. Nonetheless, this limitation permitted 
both anaesthesiologists to be able to conduct a face-to-face 
assessment of patients in a manner consistent with clinical 
practice. Indeed, the increased inter-rater agreement 
observed in our present study may be a reflection of anaesthe- 
siologists being able to interview, physically examine, and par- 
ticipate in the medical care of patients, as opposed to reviewing 
hypothetical case scenarios or blinded medical charts. Further- 
more, our sensitivity analysis suggests that these results would 
only be negated if fully one-third to one-half of all cases of 
inter-rater agreement were solely attributable to the lack of 
blinding. We would propose that such a large impact is unlikely. 
Fourthly, there were financial incentives to assign patients in 
operating theatres to ASA-PS classes III and IV. Nonetheless, 
since such incentives would encourage ratings in the operating 
theatre to disagree with ASA-PS scores assigned in the clinic, 
this bias would have led to an underestimate of reliability, as 
evidenced by our sensitivity analysis. The strength of the bias 
is also likely small, as individual financial premiums would be 
considerably diluted within a group practice plan of 65 consult- 
ant anaesthesiologists. Furthermore, previous research found 
no systematic differences in ASA-PS scores assigned by anaes- 
thesiologists who used these scores for billing purposes, as 
opposed to scores assigned by anaesthesiologists who did 
not.2° 
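
The reasoning behind a sensitivity analysis of this kind can be sketched with a chance-corrected agreement statistic. The example below assumes an unweighted Cohen's kappa and a deliberately simple scenario in which a chosen fraction of agreeing pairs is reclassified as disagreements in a neighbouring class. Both choices are assumptions made for illustration; the study may have used a weighted variant and a different reclassification scheme. The function names `cohen_kappa` and `discount_agreement` and all counts are hypothetical and are not taken from the study data.

```python
def cohen_kappa(table):
    """Unweighted Cohen's kappa from a square table of counts.

    Rows are the first rater's classes, columns the second rater's.
    """
    k = len(table)
    total = sum(sum(row) for row in table)
    observed = sum(table[i][i] for i in range(k)) / total
    row_marg = [sum(table[i]) / total for i in range(k)]
    col_marg = [sum(table[i][j] for i in range(k)) / total for j in range(k)]
    expected = sum(r * c for r, c in zip(row_marg, col_marg))
    return (observed - expected) / (1 - expected)


def discount_agreement(table, fraction):
    """Move `fraction` of each diagonal cell to a neighbouring cell.

    Assumed scenario: that share of agreement is treated as an artefact
    (for example, of unblinded raters) and counted as disagreement.
    """
    k = len(table)
    adjusted = [[float(x) for x in row] for row in table]
    for i in range(k):
        moved = table[i][i] * fraction
        j = i + 1 if i + 1 < k else i - 1   # adjacent class
        adjusted[i][i] -= moved
        adjusted[i][j] += moved
    return adjusted


# Hypothetical counts (illustration only): clinic rating by theatre rating.
ratings = [
    [40, 10, 0, 0],
    [8, 120, 30, 2],
    [1, 25, 150, 14],
    [0, 2, 12, 36],
]
for fraction in (0.0, 1 / 3, 1 / 2):
    kappa = cohen_kappa(discount_agreement(ratings, fraction))
    print(f"fraction discounted = {fraction:.2f}, kappa = {kappa:.2f}")
```

Comparing kappa across the discounted fractions shows how much artefactual agreement would be needed before the estimate of reliability changed materially.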

Conclusions 

In a large single-institution cohort study, the ASA-PS scale had 
moderate inter-rater reliability in clinical practice. The scale 
also showed validity, based on its correlation with preoperative 
characteristics and its prediction of postoperative outcomes. 
Despite the inherent subjectivity of the ASA-PS scale, our find- 
ings support its use as a measure of preoperative health status. 